DCU at the NTCIR-12 SpokenQuery&Doc-2 Task
Abstract
We describe DCU’s participation in the NTCIR-12 SpokenQuery&Doc-2 (SQD-2) task. For the slide-group retrieval sub-task, we experiment with a passage retrieval method that re-scores each passage according to the relevance score of the document from which it is taken. The passage and document relevance scores are calculated independently using the Okapi BM25 probabilistic retrieval model and are then combined by linear interpolation. Alongside this, we assess the benefits of pseudo-relevance feedback for expanding the textual representation of the spoken queries with terms found in the top-ranked documents and passages, and we experiment with a general multidimensional optimisation method to jointly tune the BM25 and query expansion parameters using queries and relevance data from the NTCIR-11 SQD-1 task. Retrieval experiments over the SQD-1 and SQD-2 queries confirm previous findings that integrating document information when ranking passages can improve passage retrieval effectiveness. The results also indicate that query expansion, used in combination with our retrieval models, yields no significant gains in retrieval effectiveness over these two query sets.
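The re-scoring step described in the abstract can be illustrated with a minimal sketch. It assumes that BM25 scores for passages and for their parent documents have already been computed separately. The function name, the dictionary layout, and the interpolation weight `lam` are hypothetical and are not taken from the paper, and the abstract does not say whether scores are normalised before they are combined.

```python
from typing import Dict, List, Tuple


def interpolate_scores(
    passage_scores: Dict[str, float],
    passage_to_doc: Dict[str, str],
    doc_scores: Dict[str, float],
    lam: float = 0.5,
) -> List[Tuple[str, float]]:
    """Re-score passages by linearly interpolating passage and document scores.

    lam is a hypothetical interpolation weight in [0, 1]:
    lam = 1.0 uses only the passage score, lam = 0.0 only the document score.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")

    combined = {}
    for passage_id, p_score in passage_scores.items():
        # A passage whose parent document was not retrieved gets a document score of 0.
        d_score = doc_scores.get(passage_to_doc[passage_id], 0.0)
        combined[passage_id] = lam * p_score + (1.0 - lam) * d_score

    # Rank passages by the combined score, highest first.
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Toy scores standing in for BM25 output (invented for illustration only).
passage_scores = {"p1": 2.0, "p2": 3.0, "p3": 1.0}
passage_to_doc = {"p1": "dA", "p2": "dB", "p3": "dA"}
doc_scores = {"dA": 10.0, "dB": 2.0}

ranking = interpolate_scores(passage_scores, passage_to_doc, doc_scores, lam=0.5)
```

With these toy values, passages from the highly scored document `dA` move ahead of `p2`, even though `p2` has the highest passage-only score. In a setup like the one the abstract describes, a weight such as `lam` would be tuned on held-out queries rather than fixed by hand.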
Similar resources
DCU at the NTCIR-11 SpokenQuery&Doc Task
We describe DCU’s participation in the NTCIR-11 SpokenQuery&Doc task. We participated in the spoken-query spoken content retrieval (SQ-SCR) subtask, using slide-group segments as the basic indexing and retrieval units. Our approach integrates normalised prosodic features into a standard BM25 weighting function to increase the weights of terms that are prominent in speech. Text queries and re...
Overview of the NTCIR-12 SpokenQuery&Doc-2 Task
This paper presents an overview of the Spoken Query and Spoken Document retrieval (SpokenQuery&Doc-2) task at the NTCIR-12 Workshop. The task comprised two subtasks: spoken-query-driven spoken content retrieval (SQ-SCR) and spoken-query-driven spoken term detection (SQ-STD). The paper describes details of each sub-task, the data used, the creation of the speech recognition systems used ...
Spoken Document Retrieval Experiments for SpokenQuery&Doc at Ryukoku University (RYSDT)
In this paper, we describe the spoken document retrieval (SDR) systems developed at Ryukoku University, which participated in the NTCIR-11 “SpokenQuery&Doc” task. The task has two subtasks: the “spoken content retrieval (SCR) subtask” and the “spoken term detection (STD) subtask”. We participated in both the SCR and STD subtasks as team RYSDT, and this paper describes our SDR and STD systems.
UB at the NTCIR-12 SpokenQuery&Doc-2: Spoken Content Retrieval Using Multiple ASR Hypotheses and Syllables
The University at Buffalo (UB) team participated in the SpokenQuery&Doc task at NTCIR-12, working on the Spoken Content Retrieval (SCR) subtask. We investigated the use of multiple ASR hypotheses (words) and subword units (syllables) for improving retrieval effectiveness. We also compared the retrieval effectiveness based on texts generated by two automatic speech recognition (ASR) engines,...
Evaluation of DNN-based Phoneme Estimation Approach on the NTCIR-12 SpokenQuery&Doc-2 SQ-STD Subtask
This paper proposes a method for estimating the correct phoneme sequence for spoken term detection (STD) using a deep neural network (DNN)-based framework. We use a DNN architecture as a correct-phoneme estimator. In post-processing, the DNN-based estimator estimates the correct phoneme sequence of an utterance from several kinds of phoneme-based transcriptions produced by multiple ASR systems, for reducing ...
Publication date: 2016